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Abstract 

Genetic variation within species is commonly structured in a hierarchical manner which may result from superimposition of 
processes acting at different spatial and temporal scales. In organisms of limited dispersal ability, signatures of past 
subdivision are detectable for a long time. Studies of contemporary genetic structure in such taxa inform about the history 
of isolation, range changes and local admixture resulting from geographically restricted hybridization with related species. 
Here we use a set of 139 transcriptome-derived, unlinked nuclear single nucleotide polymorphisms (SNP) to assess the 
genetic structure of the Carpathian newt {Lissotriton montandoni, Lm) and introgression from its congener, the smooth newt 
(L. vulgaris, Lv). Two substantially differentiated groups of Lm populations likely originated from separate refugia, both 
located in the Eastern Carpathians. The colonization of the present range in north-western and south-western directions 
was accompanied by a modest loss of variation; admixture between the two groups has occurred in the middle of the 
Eastern Carpathians. Local, apparently recent introgression of Lv alleles into several Lm populations was detected, 
demonstrating increased power for admixture detection in comparison to a previous study based on a limited number of 
microsatellite markers. The level of introgression was higher in Lm populations classified as admixed than in syntopic 
populations. We discuss the possible causes and propose further tests to distinguish between alternatives. Several outlier 
loci were identified in tests of interspecific differentiation, suggesting genomic heterogeneity of gene flow between species. 
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Introduction 

Most species are genetically structured, and genetic structure is 
often observed at multiple spatial scales [1,2]. Genetic structure is 
the result of a complex interplay of drift, gene flow, and natural 
selection acting on standing variation and new mutations [3-5]. 
The relative importance of these evolutionary forces is contingent 
on biological features of the organisms [6], and has also been 
affected by large-scale historical events, such as the Pleistocene 
climatic oscillations [2,7]. Identification of factors responsible for 
the observed spatial structuring of genetic diversity is a major goal 
of population genetics [8]. The quantification and understanding 
of genetic structure within species are of fundamental importance 
for inferential studies of population history, population ecology 
and biodiversity conservation [3,9,10]. Analyses of genetic 
structure are also essential for several aspects of the study of 
adaptation [5,11-18]. 

Genetic variation within species is commonly structured in a 
hierarchical manner which may result from superimposition of 



processes acting at different spatial and temporal scales. For 
example the impact of major climatic oscillations is clearly visible 
in the patterns of genetic differentiation observed currently in 
temperate and boreal species [2,19]. This is believed to reflect 
mainly secondary contact and partial admixture of populations 
derived from separate refugia with a contribution of processes 
related to the expansion itself, such as allele surfing [4,7]. Within 
these major geographic groups, populations are differentiated due 
to limited dispersal producing isolation by distance [20,21]. 

In species with limited dispersal capabilities, signatures of past 
subdivision are detectable for a long time [2,22]. This may be due 
to a combination of limited dispersal per se and differential 
adaptation in refugia superimposed on contemporary ecological 
gradients [23]. Admixture may also be delayed or prevented by 
the accumulation of intrinsic incompatibilities between popula- 
tions [24,25] and poor dispersers appear to speciate on smaller 
geographic scales [26]. Thus studying contemporary genetic 
structure in taxa characterized by limited dispersal is likely to 
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provide ample information about historical and demographic 
events. Amphibians and in particular salamanders are ideal for 
such inferences [27,28]. Another advantage of such taxa is that 
they retain historical information about spatial variation of genetic 
exchange with related, incompletely reproductively isolated species 
([29] and references therein). 

Detecting, quantifying and interpreting genetic structure 
requires appropriate tools. Single nucleotide polymorphisms 
(SNPs) are powerful markers well suited for assessing genetic 
structure [30,31]. They are amendable to high throughput, cost- 
effective and reliable genotyping through array-based [32] and 
genotyping by sequencing [33] approaches. If SNP discovery and 
genotyping are performed separately, the researcher has control 
over the location and other characteristics of the SNPs selected for 
genotyping. For instance, if polymorphism data from transcrip- 
tome sequencing are available, SNPs located in known protein 
coding genes can be selected for genotyping, providing informa- 
tion about variation in functionally important regions. On the 
other hand, a discover-then-genotype approach introduces ascer- 
tainment bias, which distorts the picture of variation obtained 
from a larger sample [34] . However, initial discovery of SNPs in a 
smaller sample may be desirable for some applications, such as 
detection and quantification of population structure [35,36]. This 
is because the discovery process is biased towards more variable 
SNPs, thus increasing the per-marker information content, 
especially if SNP discovery is performed in a random or 
geographically diverse sample [37]. A distinct advantage of SNPs 
over microsatellites is that orders of magnitude more locations in 
the genome, can be easily interrogated. Thus SNPs offer a truly 
genome-wide perspective, essential if the biological processes of 
interest affect portions of the genome differentially [38-40]. 

Here we investigate the genetic structure of populations of the 
Carpathian newt (Lis.mtrihm montandoni, Lm), a species which has 
apparently survived the glacial period in the Carpathians [41], an 
important refugial area [42,43]. Two processes appear to have 
profoundly affected this species and shaped the genetic structure 
currently observed. The first includes climatic oscillations during 
the Pleistocene, likely responsible for the observed regional-scale 
genetic structuring. The second involves hybridization with and 
introgression from its widely distributed congener, the smooth 
newt [L. vulgaris, Lv). Previous studies [29,41,44,45] demonstrated 
that Lm is genetically differentiated across its range in both 
mitochondrial and nuclear (microsatellites) genome. Patterns of 
genetic differentiation and species distribution modeling per- 
formed by Zieliriski et al. [29] suggest several glacial refugia in the 
Carpathians. While multiple, spatially and temporally distinct 
introgression events from Lo resulted in complete mtDNA 
replacement in Lm, very little recent interspecific nuclear gene 
flow was suggested by microsateUite markers [29]. However, 
interspecific gene flow in some parts of the nuclear genome has 
been extensive, as evidenced by data from the Major Histocom- 
patibility Complex genes [45]. 

A set of transcriptome-derived SNPs and extensive sampling are 
used herein to address the following issues. First, we compare 
genetic structure inferred from the genome-wide sample of SNP 
markers with that estimated previously [29] from a much smaller 
number of microsatellites. Specifically we wanted to determine the 
number of genetic clusters (which may correspond to glacial 
refugia) supported by SNP markers, delineate their distribution 
and estimate the extent of admixture between them. To this end 
we use a comprehensive, uniform sampling, including previously 
undersampled Ukrainian Carpathians where admixture between 
genetic clusters was expected. Second, we test whether introgres- 
sion from Lj] is detectable in the nuclear genome of Lm with an 



increased number of markers, and if so, whether introgression 
varies geographically. Populations in which both species co-occur 
were also sampled across the range to estimate the admixture in 
syntopy. Third, we apply outiier analysis to test heterogeneity of 
gene flow within and between species and identify genes departing 
from the genomic average; such genes may be involved in 
population or species-specific adaptations. 

Material and Methods 

Sample collection 

Altogether we analyzed 473 individuals from 40 populations: 25 
populations of Lm (298 individuals), 7 syntopic populations in 
which both species co-occur (83 individuals) and 8 populations of 
Lj) (92 individuals) (Fig. 1; Table 1). Sampling sites were selected to 
cover the Lm range uniformly and to reflect Lv diversit)' in the 
surrounding areas. The average per population sample size, 12, 
miglit h(; considered low for some of the population genetic 
analyses, however we decided to rather maximize the number of 
markers as it was shown that this might be beneficial for robust 
landscape genetic inferences [46]. Throughout the text, we use 
terms population or locality interchangeably to refer to a 
particular breeding site consisting of one or more closely located 
water bodies. Adult newts were sampled by dip-netting during 
breeding season. Animals were released after tailtips were 
collected. Tissue samples were stored in 95% ethanol until DNA 
extraction. DNA was extracted using the Wizard Genomic DNA 
Purification Kit (Promega). 

Ethics statement 

Animal samples were collected under permits: DOPozgiz-4200/ 
11-78/3702/ 10/JRO provided by tiie Polish General Director for 
Environmental Protection, 03.04.12 No. 67 provided by the 
National Academy of Sciences of Ukraine and 3256/9.07.2010 
provided by the Romanian Commission for Protection of Natural 
Monuments. Samples were collected with institutional animal 
ethics approval (number 101/2009), issued by the First Local 
Ethical Committee on Animal Testing at the Jagiellonian 
University in Krakow. Tissue samples were collected according 
to the rerjuirements of these institutions: adult newts were captured 
by dip netting and tissue samples from tail tips were taken under 
anesthesia. Immediately after recovery from anesthesia, newts 
were released at the collection site. The sampling locations were 
not privately owned or protected in any way. 

SNP discovery, assay development, and genotyping 

SNPs were identified in liver transcriptomes of six Lm 
individuals sampled to encompass the genetic diversity of the 
spec:ies (Fig. 1). Transcriptomes were sequenced using lUumina 
technology and de novo assembled with Trinity [47]. Details of 
transcriptome serjuencing and assembly will be provided elsewhere 
(Stuglik et al. in prep). We used a custom bioinformatic pipeline 
[48] to construct transcriptome-based gene models (TGM) from 
the Trinity output. Reads were mapped to this reference 
transcriptome with Bowtie2 [49] and SNPs were called with 
SAMTools 0.1.18. [50]. Next, we used blast searches against 
Xmopus tropicalis transcripts to identify TGM representing protein 
coding genes. To be included in the design of the genotyping 
assay, the SNP had to fulfill the following criteria: i) occur in a 
TGM which produced an unambiguous hit to a single Xmopus 
gene and to not exhibit high similarity to other TGMs in the newt 
reference transcriptome; this criterion was applied to minimize the 
incidence of false "SNPs" derived from paralogous regions; ii) 
have a minimum sequencing depth of 15 x and minimum 
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Figure 1. The distribution of sampling localities (details in Table 1). Red triangles — Lissotriton montandoni (Lm); green circles — L vulgaris 
(Lv); two symbols superimposed — syntopic locality where both species co-occur; T - localities from which six Lm individuals were sampled for liver 
transcriptomes. The distribution of Lm (Zavadil et al. 2003 and own unpublished data) is hatched. Areas above 500 m a.s.l. are shaded. 
doi:1 0.1 371 /journal.pone.0097431 .gOOl 



genotype quality of 30 phred; iii) be located at least 60 but not 
more than 1000 bp from the exon boundary; the latter filter was 
used because last exons of many genes are long and consist mostly 
of 3' untranslated regions (UTR) which are poorly conserved 
between species; thus the length of such exons in the newt could 
not be reliably determined and particularly long last "exons" may 
be artifacts of misassembly [51]. Filtering was performed with a 
custom Python script. A total of 251 SNPs and their flanking 
sequences were scored with lUumina Assay Design Tool (ADT) 
and the lUumina VeraCode GoldenGate Assay was designed for 
192 best scoring SNPs. GoldenGate provides codominant 
genotype data for polymorphic positions with two segregating 
variants [32]. Genotyping, primary visualization, quality assess- 
ment and filtering were performed with lUumina GenomeStudio 
Data Analysis Software. AH loci with cluster separation score and 
gen train score lower than 0.2 and 0.7, respectively, were excluded 
from further analysis. We also excluded loci with minor allele 
frequency (MAF)<1% and less than 90% genotyped individuals. 

Population genetics analyses 

AUele frequencies for each locus, tests of Hardy-Weinberg 
equilibrium and tests of linkage disequilibrium (LD) were 
calculated in GENEPOP 4.1.2 [52]; the type I error was 
controlled using the false discovery rate (FDR) approach 
implemented in QVALUE 1.0 [53,54]. Expected heterozygosity 
(//e), was calculated in the R package adegenet [55]. Allelic 
richness {AR) was calculated in FSTAT 2.9.3.2 [56]. We 
interpolated geographic gradients in //e and AR using inverse 
distance weighting (IDW) in ArcGIS v 10.0 (ESRI, Redlands, CA, 
USA). Pairwise F^t values between populations and their 
significance were computed in Arlequin 3.5 [57]. Multidimen- 
sional scaling (MDS) was used for visualization of the F^t matrix. 
Principal component analysis (PCA) was performed in R using 



adegenet. The significance of correlations between genetic and 
geographical distances was calculated using the Mantel test 
implemented in IBDWS [58] . A population tree was constructed 
in POPTREE2 [59] from the pairwise F^t [60] matrix using the 
neighbor-joining method. The number of genetic clusters was 
determined and assignment of individuals to clusters was 
performed using the Bayesian approach implemented in STRUC- 
TURE 2.3.3 [61-63]. We ran Structure on two separate datasets: 
Lm, which included only morphologically pure Lm populations and 
Lm&Lv comprising Lm, Lv and syntopic populations. We ran 10 
analyses for each A" 1 - 1 5 for Lm and 1 0 replicate runs for each Kl— 
20 for LmSiLv dataset. In each case, the admixture model was 
applied and the runs consisted of 250 000 MCMC burnin steps 
followed by 1 000 000 post-burnin iterations. We performed 
inferences under the model of correlated allele frequencies for Lm, 
whereas the uncorrelated model was used for LmSiLv dataset, 
because Lm and Lv populations were expected to be more 
divergent on average. To determine the most likely number of 
genetic clusters supported by the data, we calculated AR] a 
measure of second order rate of change in the likelihood of data 
[64] , using the online software Structure Harvester [65] . Analysis 
of molecular variance (AMOVA) in Arlequin was used to partition 
SNP variation into hierarchical levels. Two groupings of 
populations were used: i) suggested by Structure; ii) supported 
by the methods based on genetic distances between populations. 
Significance levels for variance components were estimated using 
10 000 permutations. 

To identify markers departing significantly from the genome- 
wide average of differentiation among populations a scan for F^t 
outliers was performed. In order to minimize the number of false 
positives the oudier detection was performed under a hierarchical 
island model [66] in Arlequin. We performed separate scans for 
differentiation within Lm and for interspecific differentiation. In 
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Table 1. Sampling. 





Number 


Locality 


Country 


N individuals 


Species 


Longitude E 


Latitude N 


1 


Jesenil<i 


CZ 


12 


Lm 


17.31 


50.07 


2 


Sihelne 


SK 


12 


Lm 


19.39 


49.51 


3 


topuszna 


PL 


14 


Lm 


20.14 


49.51 


4 


Krempna 


PL 


12 


Lm 


21.48 


49.47 


5 


Smerel< 


PL 


12 


Lm 


22.44 


49.16 


6 


Rakovets 


UA 


12 


Lm 


24.04 


49.63 


7 


Pereprostynya 


UA 


12 


Lm 


23.36 


49.21 


8 


Zbyny 


UA 


12 


Lm 


22.95 


48.80 


9 


Vyshl<ivsky Pass 


UA 


12 


Lm 


23.63 


48.70 


10 


Mykulychyn 


UA 


12 


Lm 


24.59 


48.38 


11 


Dzembronia 


UA 


9 


Lm 


24.62 


48.07 


12 


Sucevita 


RO 


12 


Lm 


25.68 


47.75 


13 


Pasul Gutai 


RO 


12 


Lm 


23.78 


47.70 


14 


Pasul Pascanu 


RO 


12 


Lm 


25.52 


47.57 


15 


Romuli 


RO 


12 


Lm 


24.39 


47.51 


16 


Lunca Le§ului 


RO 


12 


Lm 


24.86 


47.30 


17 


Holda 


RO 


12 


Lm 


25.67 


47.27 


18 


Cuejdiu 


RO 


12 


Lm 


26.27 


47.00 


19 


Lacu Ro^u 


RO 


12 


Lm 


25.77 


46.78 


20 


Brade^elu 


RO 


12 


Lm 


25.13 


46.76 


21 


Bolatau 


RO 


12 


Lm 


26.38 


46.64 


22 


Pasul Musat 1 


RO 


12 


Lm 


26.40 


45.96 


23 


Sacele 


RO 


12 


Lm 


25.82 


45.55 


24 


Predeal 


RO 


11 


Lm 


25.56 


45.49 


25 


Voina 


RO 


12 


Lm 


25.05 


45.44 


26 


Kuzmina 


PL 


11 


Lm&Lv 


22.43 


49.62 


27 


Bronytsya 1 


UA 


12 


Lm&Lv 


23.41 


49.43 


28 


Lyucha 


UA 


12 


Lm&Lv 


24.91 


48.38 


29 


Lypcha 


UA 


12 


Lm&Lv 


23.38 


48.28 


30 


Valea Uzului 


RO 


12 


Lm&Lv 


26.30 


46.34 


31 


Penteleu 


RO 


12 


Lm&Lv 


26.35 


45.61 


32 


Lere^ti 


RO 


12 


Lm&Lv 


25.06 


45.37 


33 


Pokrzywna 


PL 


11 


Lw 


17.45 


50.29 


34 


Jasto 


PL 


12 


Lw 


21.49 


49.74 


35 


Lipni'ky 


SK 


13 


Lw 


21.42 


49.07 


36 


Rosilna 


UA 


8 


Lw 


24.34 


48.80 


37 


Dertsen 


UA 


12 


Lva 


22.65 


48.35 


38 


Stramba 


RO 


12 


Lw 


24.65 


47.25 


39 


Tazlau 


RO 


12 


Lw 


26.47 


46.73 


40 


Pasul Musat 2 


RO 


12 


Lw 


26.40 


45.97 



Country abbreviations: CZ - Czech Republic, PL - Poland, RO - Romania, SK - Slovakia, UA - Ukraine. Species and subspecies abbreviations: Lm - Lissotriton montandoni, 
Lva ~ L. vulgaris ampelensis, Lw - L. v. vulgaris, Lm&Lv - syntopic populations. 
dol:1 0.1 371 /journal.pone.0097431 .tool 



each case 50 000 coalesceiit simulations with 2 groups of 100 
demes were performed to obtain the nuU distribution of F- 
statistics. We selected candidate loci based on Fst or Fq-y values 
falling into the 1% upper and lower quantile as suggested by 
ExcofTier et al. [67]. Fct allows for identification of outlier loci 
between the groups of populations whereas Fgx identifies oudier 
loci among populations after accounting for higher-level structure 



[67]. Genes containing outliers were annotated by similarity blastx 
search against the nr protein database. 

Results 

Variation 

Genotyping was successful for 139 out of 192 markers (72%) 
and these were used for population genetic analyses (Table SI) 
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(DRYAD entry: doi:10.5061/diyad.21 Ick). The proportion of 
missing data among 473 genotyped individuals was very low (< 
0.2% single-locus genotypes). AH 139 markers were polymorphic 
in Lm and 1 12 (81.6%) in Lo (Table SI). No significant deviations 
from Hardy-Weinberg expectations were detected at the FDR 
0.05 which indicates that nuU alleles are very rare in our markers 
(Table S2). Tests for linkage disequihbrium across populations 
detected significant LD at the FDR 0.05 for 12 pairs of loci, 
however significant results were found only in three syntopic 
populations. Thus significant LD resulted not from physical 
hnkage but from local admixture, and we consider all markers as 
segregating independently. 

The expected heterozygosity (H^ ranged from 0.05 (locality 33, 
hj) to 0.29 (locality 29, syntopic) with a mean of 0.19 (SD = 0.07). 

was significandy higher in Lm than Lv (f/-test, -P = 2.9xl0 ') 
most likely due to ascertainment bias, as only markers known to be 
polymorphic in Lm were assayed. Within Lm was lowest in 
population 6, isolated in the Ukrainian Podolian Upland and 
highest in population 20 located in the Romanian Transylvanian 
Plateau, and ranged from 0.19 to 0.27 respectively, with a mean of 
0.23 (SD = 0.01) (Fig. SI). Within Lv He. ranged from 0.05 in the 
northernmost locality 33 to 0. 10 in locality 36 with a mean of 0.08 
(SD = 0.01). He in syntopic populations spanned a broad range 
from very low (0.06, locality 31) to the highest overall (0.29, 
locality 29), most likely depending on the frequency of both species 
in the population (Fig. Sib). 

Genetic structure and diversity in Lissotriton montandoni 

Genetic differentiation between Lm populations varied from 
negligible (^;st<0.001 P=0.39 between 14 and 17 located in the 
northern part of the Romanian Carpathians), to strong 
(Fsx = 0.408 P<10~'^ between populations 1 and 24 at the 
opposite limits of the species distribution) (Table S3). The MDS 
plot of pairwise i^sT revealed two major, genetically differentiated 
groups of populations with distinct geographic distributions in the 
northern and southern part of the species range (Fig. 2a). Pairwise 
FsT values within groups were similar (averages of 0. 100 and 0.098 
in the northern and southern group, respectively), and overlapped 
only shghtly with the distribution of pairwise between 
populations from diflFerent groups (mean 0.271; randomization 
test P<0.001; Fig. S2). Within the northern group two populations 
appeared distinct from the rest. Notably both are isolated from the 
continuous part of the species range (Fig. 1). The westernmost 
locality 1 in the Sudetes Mountains is separated from the main 
range by the Moravian Gate and locality 6 in the Ukrainian 
Podolian Upland by the Dniester river. The two groups of Lm 
populations are strongly supported also by the population tree 
(Fig. 3). 

The presence of two genetic clusters is also evident from 
individual-based analyses. 

In principal component analysis (PCA) PCI (15.8% of variance 
explained) separated newts from northern and southern popula- 
tions (Fig. 2b). The Evanno [64] method also supported K= 2 as 
the most likely number of clusters in the Structure analysis (Fig. 
S3). Structure detected some admixture between the two clusters 
in the Ukrainian Carpathians and the northern part of the 
Romanian Carpathians. Admixture was strongest in population 10 
which was therefore excluded from the AMOVA analysis (Fig. 4). 
AMOVA attributed 19.5% of total variation to differentiation 
between clusters and 8.2% to differentiation between populations 
within clusters (Table 2). Whereas no alleles were private to any 
population, 8 and 20 alleles were private for the northern and 
southern group, respectively. No significant differences between 
groups were detected in H^ [U-iest, P=0.37), but allelic richness 




~i — 

-10 



-5 



1 

0 

PCI (15.8%) 



~r~ 

10 



Figure 2. Genetic differentiation within L. montandoni. (a) Non- 
metric two-dimensional scaling of the pairwise F^j matrix; orange - 
populations from the northern group; blue - populations from the 
southern group; (b) Principal Component Analysis (PCA) performed on 
individual genotypes; in parentheses percentage of variance explained 
by principal components; orange - individuals from the northern 
group; blue - individuals from the southern group. 
doi:10.1371/journal.pone.0097431.g002 

was higher in the southern group (1.62 vs, 1.57; f/-test, P= 0.0042) 
(Fig. S4). As could have been expected from the significant among- 
population differentiation, there was a strong, highly significant 
isolation by distance observed both at the level of the entire species 
(Mantel test, ! = 0.89, /'<0.001) and within genetic clusters (North: 
) = 0.78, P<0.001; South: r=0.78, P<0.001) (Fig. S5). 

Differentiation within Lissotriton vulgaris 

Genetic structure among Lv populations around the Car- 
pathians is stronger and presumably deeper than that within Lm. A 
deep split between populations outside the Carpathian belt and 
those in the Carpathian Basin is visible in the population tree 
(Fig. 3), MDS (Fig. 5a) and PCA plot (PC3, Fig. 5c). Structuring 
within the Carpathian Basin is also pronounced, as substantial 
genetic distance separates the single analysed population of L. 
vulgaris ampelemis (locality 38) from two populations of the nominal 
subspecies L. vulgaris vulgaris (Fig. 3, 5a, 5c). 
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0.05 



Figure 3. Relationships among populations. A neighbor-joining 
tree was constructed from the matrix of pairwise F^Y, syntopic 
populations were excluded. Robustness of relationships was tested 
with 1000 bootstrap replicates. Red - L montandoni: orange - northern, 
blue - southern group; green - L vulgaris: yellow - populations in the 
Carpathian Basin, violet - populations outside the Carpathian Basin. 
doi:1 0.1 371 /journal.pone.0097431 .g003 

Genetic differentiation and gene flow between species 

Strong difFerentiation between Lm and Lv was detected by the 
population-based analyses (Table S3, Fig. 3, 5a). AMOVA 
revealed that 43.5% variation was distributed between species 
and 13.3% between populations within species (Table 2). Three of 
seven syntopic populations occupied intermediate positions in the 
MDS plot (Fig. 5a). However, an overwhelming majority of newts 
in syntopic populations fell within the range of variation of either 
one or the other species; only a handful of substantially admixed 
individuals and possibly a single Fl hybrid were detected (Fig. 5b, 
6). K= 2 was strongly supported by Structure when the two species 
were analyzed together (Fig. 6, S6). Structure confirmed that in 
syntopic populations admixture is limited, genotypes of two 
parental species dominate and significantly admixed individuals 



are rare. Structure also provided an important insight which was 
not visible in PCA results: a clearly detectable (>3%) admixture of 
Lv genes in four Lm populations. Three of these were in the 
southern part of the Romanian Eastern Carpathians (localities 19, 
20, 25) and one was the isolated locality 1 in the Sudetes 
Mountains; the average admixture in these populations was 8.5%. 
No admixture of Lm genes was detected in Lv populations. The 
comparison of the average proportion of admixture on both 
genetic backgrounds in syntopic populations demonstrated that 
the mean admixture was very low, ca. 2 % and that the proportion 
of admixture did not differ (U-test, P= 0.24) between Lv and Lm 
backgrounds. 

Detection of outliers 

The scan for outliers performed in Lm (locality 10 excluded, see 
above) revealed 14 F^r outliers (10.0%) at the significance level of 
0.01: 9 loci (9, 17, 25, 28, 34, 45, 57, 73, 137) showed an excess of 
difFerentiation among populations (candidates for local adapta- 
tions) and 5 (4, 6, 22, 84, 93) loci were less differentiated than 
expected under neutrality (candidates for balancing selection) 
(Fig. 7a). Three i^cT outliers (45,73,75) were identified as 
candidates for diversifying selection between the northern and 
southern Lm groups and two (72, 89) as candidates for balancing 
selection (Fig. 7b). 

Screening for oudier loci between Lm and Lv was performed 
excluding syntopic populations (26-32). A total of nine (6.5%) Fct 
outlier loci were identified: eight (15, 72, 79, 112, 116, 117, 128, 
129) were more differentiated than expected under the neutral 
model and one (126) was less differentiated (Fig. 7c). Only one of 
the interspecific oudiers was a nonsynonymous polymorphism. 
Locus 72 located within gene SRSFl, involved in splicing 
regulation, was classified as a candidate for balancing selection 
between the northern and southern Lm groups and at the same 
time as a candidate for divergent selection between species. 

Discussion 

Isolation in glacial refugia and limited dispersal 
determine the genetic structure of Lm 

Two clearly difiFerentiated genetic units were identified in Lm by 
SNP data: the northern group in the Western Carpathians and the 
western part of the Eastern Carpathians, and the southern group 
across the rest of the species range. Admixture between them 
occurs around the Romanian-Ukrainian border. Zielihski et al. 
[29] identified three units in microsatellite data: our southern 
group combines their eastern and southern units. Can the 
discrepancy between the two studies be reconciled? Below we 
argue that SNPs reflect the history and difFerentiation better than 
microsatellites and offer several explanations of the apparent 
discrepancy between the datasets. We consider the alternative 
explanation that SNP data fail to detect true difFerentiation 
unlikely on several grounds. First, an overwhelming majority of 
pairwise F^t values calculated from SNPs were significant, 
demonstrating substantial power to detect differentiation. Second, 
remarkably strong isolation by distance is observed in Lm and the 
apparent break between the eastern and southern groups of 
Zielinski et al. [29] coincides with a gap in their sampling. This 
gap has been filled by the present study. Thus, isolation by 
distance and non-uniform sampling may have resulted in 
delineation of the apparently distinct unit [68,69]. Third, the 
population tree based on SNPs shows a remarkable pattern 
consistent with colonization from two refugia, but it is difficult to 
explain under the assumption of expansion from three refugia. 
While in both groups relationships between some populations are 
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Figure 4. Genetic structure of L. montandoni inferred by Structure for K=2 groups. For each population pie charts show the fraction of the 
genes from the northern (orange) and southern (blue) groups. 
doi:1 0.1 371 /journal.pone.0097431 .g004 



poorly resolved, in each group several populations are related in a 
nested fashion, with progressively longer branches at the deeper 
nesting levels. Nesting involves populations distributed from the 
center to the periphery of the species range - to the west in the 
Western Carpathians and to the south-west in the Romanian 
Carpathians. We hypothesize that the populations with poorly 
resolved relationships are those inhabiting the refugial areas and 
sharing most variation retained there. Populations related to each 

Table 2. Results of the Analysis of Molecular variance (AMOVA). 



other in a nested fashion would be those which colonized the 
present range through serial events [4,70]. Taking the evidence 
together, we propose that the location of the refugium for the 
northern group was in the Eastern Carpathians close to the Polish- 
Ukrainian border, and that the refugium for the southern group 
was in the central part of the Eastern Carpathians in Romania, 
Species distribution models for the LGM reported by Zielihski et 
al. [29] are broadly consistent with the proposed location of 





Source of variation 


d.f. 


Sum of squares 


Percentage of variation explained 


P 


Two groups within L. montandoni 


Among groups 


1 


1260.054 


19.52 


<0.0001 


Among populations within groups 


22 


1330.68 


8.16 


<0.0001 


Within populations 


548 


8988.36 


72.32 


<0.0001 


Two species 


Among groups 


1 


4019.136 


43.47 


<0.0001 


Among populations within groups 


31 


3545.323 


13.31 


<0.0001 


Within populations 


747 


10316.708 


43.22 


<0.0001 



AMOVAs were performed for: 1) two groups within L. montandoni; ii) two species (excluding syntopic populations). 
doi:10.1371/journal.pone.0097431.t002 
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Figure 5. Genetic differentiation between L. montandoni (Lm) 
and L. vulgaris (Lv\, (a) Non-metric two-dimensional scaling of the 
matrix of pairwise Fsj between populations; red triangles - Lw; green 
circles - Lv; grey diamonds - syntopic populations; (b) and (c) Principal 



Component Analysis (PCA) performed on individual genotypes; in 
parentheses percentage of variance explained by principal compo- 
nents; red triangles - Lm; green circles - Lv; grey diamonds - individuals 
from syntopic populations. 
doi:10.1371/journal.pone.0097431.g005 



refugia. The role of the Carpathians as a major refugium for 
European biota has recently been well documented [42,43,71,72]. 
Multiple species show genetic differentiation between the Western, 
Eastern and Southern Carpathians, pointing to the presence of 
several refugia (reviewed in [29,72]). So far, a refugium in the 
western part of the Eastern Carpathians has to our knowledge not 
been proposed. 

Expansion from refugia is commonly accompanied by loss of 
variation [4,73]. Reduction of genetic variation along the 
postulated expansion routes is visible in our data, but the signal 
is not strong, and may be distorted locally by rntrogression from Lo 
(see below). Hence expansion was apparently not accompanied by 
severe bottlenecks and thus only a minor fraction of variation has 
been lost. The strongest reduction in genetic variation occurred in 
locality 6 in the Podolian Upland isolated from the main portion of 
the range. This population is a remnant of a geographically remote 
group of Lm populations [74] which has been hypothesized to be 
isolated from the main part of the range for several thousand years 
[75]. 

Salamanders often exhibit low individual mobility and strong 
phUopatry [27,76]. Genetic differentiation between salamander 
populations appears to reflect these features although the 
geographic scale of subdivision differs among species [77,78], 
which may be related to habitat characteristics [79,80] and to life- 
history traits [81]. In continuous habitats limited dispersal abilities 
are likely to generate isolation by distance patterns with a gradient 
of genetic differentiation among sites, on which larger-scale, 
hierarchical differentiation reflecting geographic or environmental 
barriers may be superimposed [80,82]. A comparison of our 
results with those of a study [83] on fme-scale genetic differen- 
tiation in L. vulgaris graecus suggests that a combination of isolation 
by distance, probably due to limited dispersal, and spatial 
clustering due to historical fragmentation and/or landscape 
barriers occurs in Lissotriton newts at both micro- and macroscales. 

Local introgression of Lv alleles into the Lm nuclear 
genome is detectable with SNP markers 

A major finding of the present study is substantial introgression 
of Lv nuclear alleles into some Lm populations. This is contrary to 
the findings of Zielinski et al. [29] who detected very little recent 
nuclear introgression in either direction. One likely explanation for 
the difference between the studies is the number of markers 
employed [84]. While we analyzed 139 unlinked SNPs, inference 
about introgression in the previous study was based on only 10 
microsatellites. The observed discrepancy does not result from 
differences in sampling because three of four admixed populations 
were analysed in both studies. As SNP markers were discovered in 
a sample of Lm individuals, our study did not use diagnostic 
markers. This could be considered a weakness if viewed from the 
perspective of classical studies of hybrid zones which usually 
employed a limited number of diagnostic markers. However, 
because of the widespread genomic heterogeneity of interspecific 
gene flow [38,40,85,86], such diagnostic markers may constitute a 
highly nonrandom sample of the genome, enriched in genomic 
regions strongly differentiated between species. In our opinion 
randomly selected polymorphisms are better suited for an 
vmbiased assessment of introgression. We acknowledge that ideally 
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Figure 6. Genetic differentiation and admixture between L. montandoni and L vulgaris inferred by Structure for K= 2 groups. For each 
population pie charts show the fraction of L montandoni (red) and L vulgaris (green) genes. 
doi:1 0.1 371 /journal.pone.0097431 .g006 



both species .should be included in the discovery panel; this would 
however limit the number of polymorphic loci useful for the 
assessment of genetic structure within Lm. The current study 
demonstrates two peculiar features of Lm x Lv hybridization. First, 
appreciable (>3%) introgression was detectable only locally, in 
four of 25 sampled Lm localities. In these populations most 
individuals were introgressed and the average admixture of Lv 
genes was 8.5%. Second, in the introgressed Lm populations, 
admixture was stronger than in seven syntopic localities, where it 
was barely detectable. Thus current syntopy, even if it leads to 
occasional hybridization, as shown by a single putative Fl hybrid, 
does not necessarily cause introgression. This is somewhat 
surprising because a study of a Lm/ Lv hybrid zone at microscale 
detected strong assortative mating but also found that syntopy was 
almost universally accompanied by some admixture [44]. As the 
four admixed Lm populations testify, nuclear introgression of Lv 
alleles into Lm populations extends beyond syntopy, but does not 
permeate into the core of the Lm range. 

Local differences in the extent of introgression may be explained 
by several mechanisms. The introgressed populations may be 
simply located at the tails of local hybrid zones, and would thus be 
sampled entirely by chance. However other potential explanations 
deserve consideration. Local ecological conditions may either 
favor introgression or delay removal of introgressed alleles by 
selection [87]. Differences in abundance of species in a breeding 
locality may force the rarer species to hybridize due to scarcity of 



conspecific mates, but we have not observed this effect in syntopic 
populations. If species are genetically structured, as in our case, 
introgression may be easier between some genetic groups if their 
genomes harbor fewer incompatible alleles and thus intrinsic 
selection against hybrids is weaker or ecological/sexual adapta- 
tions are similar [24,25]. Lv is strongly differentiated genetically 
[41,88,89] and various Lv groups come into contact with Lm 
populations in the Carpathian Basin, and outside the Carpathian 
belt. If introgression is neutral, the observed pattern may result 
from expansion-related phenomena [4]. Under the scenario 
modeled by Currat et al. [90], when one species invades the 
range of another, neutral introgression occurs almost exclusively 
from the resident to the invading species. Thus, local expansion of 
Lm would bring Lv genes onto its genetic background. A 
comparison of the two isolated Lm populations may be instructive 
in this respect. Population 1 at the western margin of the species 
range, probably the result of postglacial or more recent expansion, 
has recendy introgressed Lv mtDNA and shows clear evidence of 
nuclear introgression. Another isolated population (6), close to the 
postulated refugial area of the northern Lm group and possibly 
surviving in situ for a long time, shows no trace of nuclear 
introgression. Scenarios related to the Currat et al. [90] model 
were favored as the explanation of mtDNA introgression and 
replacement in Lm by Zielinski et al. [29]. 

In addition to laboratory experiments which are difficult to 
perform in this system due to logistic reasons, two other kinds of 
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Figure 7. Detection of outlier loci from genome scans, (a) F^j 

outliers in L. montandoni, (b) Fct outliers in L. montandoni, (c) Fct 
outliers in interspecific analysis - markers at the extremes of 
interspecific differentiation. 
doi:10.1371/journal.pone.0097431.g007 

analyses would be informative with respect to the causes of the 
apparent differentiation in the extent of introgression. Examina- 
tion of several transects through hybrid zones in the context of 
local environmental conditions and relative species abundance 
could be informative as demonstrated in multiple systems [87,91- 
93]. Another important way forward would be to use multUocus 
sequence data [94] to construct and test multipopulation models of 
gene flow between Lm and Lv. Models distinguishing two groups 
within Lv, inside and outside of the Carpathian basin, as well as 
two groups within Lm can be evaluated and hypotheses regarding 
the timing and extent of gene flow may be tested within an 
Approximate Bayesian Computations framework [95,96]. This 
approach would provide a longer-scale perspective on gene flow 
between species and its spatial and temporal variation. 

Genomic heterogeneity of gene flow within and 
between species 

Outlier loci were detected both within Lm and between Lm and 
Lv. Such candidate loci may signal various forms of selection 
acting on the markers themselves or at linked sites [24,38], 
Alternatively their apparent outlier status may result from 
violation of the model assumptions, to which the available 
methods are very sensitive [13]. We do not attempt a formal 
functional analysis of the identified outliers but rather emphasize 
that the outliers detected in the Lm-Lv comparison indicate 
heterogeneity of interspecific gene flow in nuclear protein coding 
genes. Dramatic discordance in the propensity for interspecific 
gene flow occurs between the mitochondrial and nuclear genome 
([29]; this study). Within the nuclear genome the genes of MHC 
class II introgress easily between the two species [45] . The present 
study suggests that heterogeneity of gene flow is widespread in the 
nuclear genome. Some genomic regions, typically linked to genes 
involved in intergenomic incompatibilities or underlying species- 
specific adaptations, i.e. genes which may cause reduced hybrid 
fitness, acquire reproductive isolation earlier than other regions 
[38,97,98]. The size of such regions and mechanisms responsible 
for maintenance of genomic differentiation have been a subject of 
ongoing controversy and intense recent research [85,98-101]. It is 
expected that the shape of the heterogeneity in gene flow wiU 
evolve over time and a comparison of the extent of heterogeneity 
at various stages of divergence is of great interest for the 
understanding of the buildup of genomic divergence as differen- 
tiation progresses [40,86]. Transcriptome data, such as those used 
here for the development of SNP markers, are being applied to 
study genomic heterogeneity of gene flow in the Lm/ Lv system 
(Stuglik et al. in prep.). 

Conclusions 

Using a panel of transcriptome-derived SNP markers, our study 
has demonstrated that isolation in glacial refugia and limited 
dispersal have been the main factors determining the genetic 
structure of Lm. Two substantially differentiated groups of Lm 
populations likely originated from separate refugia, both located in 
the Eastern Carpathians, The colonization of the present range in 
north-western and south-western directions was accompanied by a 
modest loss of variation. Local introgression of Lv alleles into 
several L?n populations was detected, Introgression was higher in 
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Lm populations classified as admixed than in syntopic populations. 
We discuss the possible causes of this discrepancy and propose 
further tests to distinguish between alternatives. Several outliers 
were identified in tests of interspecific differentiation, suggesting 
genomic heterogeneity of gene flow between species. The shape of 
genomic heterogeneity at various stages of species divergence is of 
major interest for the understanding of the buildup of difiFeren- 
tiation across the genome and Im/Lo is a promising study system 
in this respect. 

Supporting Information 

Figure SI Expected heterozygosity, (a) interpolated geo- 
graphic gradients in L. montandoni (Lm), (b) means for all 
populations: triangles - Lm, diamonds - syntopic, circles - Lv. 
(TIF) 

Figure S2 Histograms showing the distribution of 
pairwise F^t between populations within the northern 
and southern L. montandoni groups and between 
groups. 

(TIF) 

Figure S3 Identification of the number of groups (K) in 
Structure analysis for L. montandoni. (a) Evanno et al. 

(2005) method; (b) means and standard de\'iati()ns (SD) of the In- 
likelihood of the probability of data for various values of K. 
(TIF) 

Figure S4 Interpolated geographic gradients of allelic 
richness in L. montandoni, 

(TIF) 

Figure S5 Isolation by distance in L. montandoni. 

Relationships between pairwise -Fst and log-geographic distances 
are presented for all populations and populations within northern 
and southern groups separately. 
(TIF) 
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